Learnings Options End-to-End for Continuous Action Tasks
Abstract
We present new results on learning temporally extended actions for continuous-action tasks using the options framework (Sutton et al. [1999b], Precup [2000]). To this end, we use the option-critic architecture (Bacon et al. [2017]) with a deliberation cost and train it with proximal policy optimization (Schulman et al. [2017]) instead of vanilla policy gradient. Results on MuJoCo domains are promising, but they raise interesting questions about when a given option should be used, an issue directly connected to the use of initiation sets.
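The abstract names two ingredients without giving equations: the clipped surrogate objective of proximal policy optimization and a deliberation cost on option termination. The sketch below illustrates both in their commonly cited forms, not the authors' implementation. The function names, the `clip_eps` default, and the exact form of the termination term (option advantage plus a constant deliberation cost) are assumptions made for illustration.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate over a batch (a quantity to maximize).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is an assumed default.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # importance ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def termination_advantage(q_option, v_state, deliberation_cost):
    """Advantage of continuing the current option, shifted by the cost.

    Assumed form: Q(s', o) - V(s') + eta, where eta is the deliberation cost.
    """
    return q_option - v_state + deliberation_cost


def termination_loss(beta, q_option, v_state, deliberation_cost):
    """Loss to minimize with respect to the termination probability beta.

    When the shifted advantage is positive, minimizing this lowers beta,
    so the option is kept longer; a larger cost discourages switching.
    """
    return beta * termination_advantage(q_option, v_state, deliberation_cost)


if __name__ == "__main__":
    # A ratio above 1 + clip_eps with a positive advantage is clipped.
    print(ppo_clipped_objective([1.0], [0.0], [1.0]))
    # Without the cost the option looks slightly worse than average;
    # a large enough deliberation cost flips the sign.
    print(termination_advantage(1.0, 1.5, 0.0))
    print(termination_advantage(1.0, 1.5, 0.6))
```

In this sketch the deliberation cost acts as a margin: an option is only worth terminating when its value falls below the state value by more than the cost.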
Journal: CoRR
Volume: abs/1712.00004
Publication date: 2017